CLEAVAGE OF CAULOBACTER PRODUCED
RECOMBINANT FUSION PROTEINS



FIELD OF INVENTION

This invention relates to the expression and secretion of recombinant fusion 
proteins from Caulobacter wherein a heterologous polypeptide is fused with all or part 
of the surface layer protein (S-layer protein) of the bacterium.

BACKGROUND OF THE INVENTION 

Many bacteria assemble layers composed of repetitive, regularly aligned,
proteinaceous sub-units on the outer surface of the cell. These layers are essentially
two-dimensional paracrystalline arrays and, being the outer molecular layer of the
organism, directly interface with the environment. In Caulobacter, the S-layer protein
is synthesized by the cell in large quantities and the S-layer completely envelops the cell
and thus appears to be a protective layer.

Caulobacter are natural inhabitants of most soil and freshwater environments
and may persist in waste water treatment systems and effluents. The bacteria alternate
between a stalked cell that is attached to a surface, and an adhesive motile dispersal cell
that searches to find a new surface upon which to stick and convert to a stalked cell.
The bacteria attach tenaciously to nearly all surfaces and do so without producing the
extracellular enzymes or polysaccharide "slimes" that are characteristic of most other
surface attached bacteria. Caulobacters have simple requirements for growth. The
organism is ubiquitous in the environment and has been isolated from oligotrophic to
mesotrophic situations. They are known for their ability to tolerate low nutrient level
stresses, for example, low phosphate levels.

All of the freshwater Caulobacter that produce an S-layer are similar and have
S-layers that are substantially the same under electron microscopy. The layers are
hexagonally arranged in all cases, with a similar centre-centre dimension (see: Walker,
S.G., et al. (1992) "Isolation and Comparison of the Paracrystalline Surface Layer
Proteins of Freshwater Caulobacters" J. Bacteriol. 174: 1783-1792).



WO 00/04170 






PCT/CA99/00637 



16S rRNA sequence analysis of several S-layer producing Caulobacter strains
shows that they group closely (see: Stahl, D.A. et al. (1992) "The Phylogeny of Marine
and Freshwater Caulobacters Reflects Their Habitat" J. Bacteriol. 174: 2193-2198).
DNA probing of Southern blots using the S-layer gene from C. crescentus CB15
identifies a single band that is consistent with the presence of a cognate gene (see:
MacRae, J.D. and J. Smit (1991) "Characterization of Caulobacters Isolated from
Wastewater Treatment Systems" Applied and Environmental Microbiology 57: 751-758).
Furthermore, antisera raised against the S-layer protein of CB15 react against
the S-layer protein of other Caulobacter (see: Walker, S.G. et al. (1992) [supra]). All
S-layer proteins isolated from Caulobacter may be substantially purified using the same
methods. All strains appear to have a polysaccharide species which may be required
for S-layer attachment (see: Walker, S.G. et al. (1992) [supra]).

The S-layers elaborated by freshwater isolates of Caulobacter are visibly
indistinguishable from the S-layer produced by Caulobacter strains CB2 and CB15.
The S-layer proteins from the latter strains have a molecular weight of approximately
100,000, although sizes of S-layer proteins from other species and strains will vary.
The hydrophilic S-layer protein has been characterized both structurally and
chemically. It is composed of ring-like structures spaced at 22 nm intervals arranged
in a hexagonal manner on the outer membrane. The S-layer is bound to the bacterial
surface and may be removed by low pH treatment or by treatment with a calcium
chelator such as EDTA.

The similarity of S-layer proteins in different strains of Caulobacter permits the 
use of a cloned S-layer protein gene of one Caulobacter strain for retrieval of the 
corresponding gene in other Caulobacter strains (see: Walker, S.G. et al. (1992) 
[supra]; and MacRae, J.D. et al. (1991) [supra]). 

Expression of a heterologous polypeptide as a fusion product with the S-layer
protein of Caulobacter provides advantages not previously seen in systems for
production of recombinant fusion proteins using other organisms such as E. coli and
Salmonella. All known Caulobacter strains are believed to be harmless and are nearly
ubiquitous in aquatic environments. In contrast, many Salmonella and E. coli strains
are pathogens. Consequently, expression and secretion of a heterologous polypeptide
using Caulobacter as a vehicle has the advantage that the expression system will be
stable in a variety of outdoor environments and may not present problems associated
with the use of a pathogenic organism. Furthermore, Caulobacter are natural biofilm
forming species and may be adapted for use in fixed biofilm bioreactors. The quantity of
S-layer protein that is synthesized and secreted by Caulobacter is high, reaching 12%
of the cell protein.

There is an existing need to produce pure proteins and peptides in an economical
manner and in a manner that minimizes or simplifies the purification steps needed after
fermentation. Key commercial areas include the production of recombinant human and
animal therapeutic, antibiotic and vaccine peptides, industrial enzymes, protein
polymers, and antibacterial enzymes for foodstuffs. Many of these commercial
applications require low production costs and there are few expression systems available
that can meet such cost constraints. In addition, there are numerous research applications
where rapid methods to produce and purify proteins are needed to facilitate the
discovery stage. This is especially true where there is a desire to express a large
number of proteins with unknown function (from a collection of cloned cDNAs, for
example) or a large number of variants of a single protein (for example, resulting from
site directed mutagenesis) in a search for variants with improved properties.

Generally, proteins must be secreted to be produced at low cost. The primary
reason is the much reduced cost of purification of the target protein from cell material.
However, even for secreted proteins, simple methods of separating the product from
spent culture and cells are important for cost reduction and ease of use.

An international patent application published as WO 97/34000 on September 18,
1997 describes the expression and secretion of recombinant proteins from Caulobacter
in which the recombinant protein is a fusion of all or part of Caulobacter S-layer protein
with a heterologous protein of interest (also see: Bingle, W.H., et al. 1997¹ "Linker
Mutagenesis of the Caulobacter crescentus S-layer protein: Toward a Definition of an
N-terminal Anchoring Region and a C-terminal Secretion Signal and the Potential for
Heterologous Protein Secretion" J. Bacteriol. 179: 601-611).

The Caulobacter S-layer secretion apparatus is in the category of "Type 1"
secretion usually found in pathogenic bacteria and noted for its ability to secrete a wide
variety of proteins including large and hydrophilic proteins. The Caulobacter protein
secretion system is particularly useful to secrete recombinant proteins.

The Caulobacter S-layer Type 1 secretion pathway requires only a C-terminal
secretion signal, typically comprising about 200 amino acids at the end of the protein.
The export mechanism is capable of tolerating a wide variety of foreign proteins.
Recombinant proteins may be conveniently produced as fusion proteins with the target
protein being fused to the C-terminal secretion signal. Depending on the application, it
may be desirable to remove the secretion signal following secretion. Not removing the
secretion signal may be an approach suitable for many subunit vaccine applications,
where the remaining S-layer protein serves as a carrier.

A unique and desirable feature of fusion proteins produced by the Caulobacter
S-layer protein secretion system is that they form insoluble aggregates in the culture
medium. This is apparently a consequence of the S-layer sequences associated with
the secretion signal and reflects the fact that the protein normally self-assembles into a
two-dimensional crystalline layer on the bacterium's surface. These aggregates are visible
to the naked eye and are readily collected by simple filtration. With simple water wash
steps, residual bacterial cells are readily flushed away. It is routinely possible to
achieve a protein purity of 90% or better with this simple purification procedure.

DESCRIPTION OF THE PRIOR ART 

Most current protein purification systems for recombinant proteins produced by
bacteria rely upon an affinity matrix to achieve separation of the target protein and to
concentrate the protein for subsequent steps of purification. To accomplish this, genes
for recombinant proteins are commonly constructed so that they contain affinity tags,
which are protein sequences that will bind to an affinity matrix. Commonly used
systems include the following:

(a) glutathione S-transferase (GST) tag, which binds to glutathione-sepharose
matrices;

(b) maltose binding protein (MBP) tag, which binds to amylose matrices;

(c) multiple tandem histidine residues (e.g. "His-6") tag, which binds to
nickel-derivatized solid matrices; and

(d) protein A tag, which binds to immunoglobulin G (IgG)-derivatized sepharose or
comparable matrices.

Prior art techniques were typically developed so that removal of a target protein
does not disrupt the tag and matrix association. Instead, enzymes that cleave specific
sequences of amino acids are employed. The enzyme cleavage sequence is positioned
between the tag and the desired recombinant protein and enzymatic cleavage is effected
directly on the matrix with attached fusion protein. If a secretion signal is used, the
cleavage site is usually positioned such that the secretion signal is separated from the
target recombinant protein during the cleavage step. The matrix is regenerated for
re-use only after the target recombinant protein has been purified away from the matrix.
Typical enzymes used in these methods are Factor Xa, enterokinase and collagenase.

Chemical cleavage is generally not used because the conditions required for
cleavage will disrupt the binding of affinity tag and matrix or destroy the matrix. When
chemical cleavage is used with recombinant fusion proteins to cleave target protein from
a secretion signal and/or affinity tag, solubilization and denaturation processes are
generally employed. The expectation is that complete or nearly complete unfolding of
the protein is a prerequisite for effective cleavage.

Mild-acid cleavage is predicated on the inclusion, by happenstance or design, of
the acid-sensitive aspartate-proline dipeptide at a desired site for cleavage. The protein
to be cleaved is typically exposed to conditions that solubilize and/or completely
denature the protein prior to cleavage. The chaotropic agent guanidine hydrochloride
(used at 6-7 M) is commonly employed to denature and solubilize the protein prior to,
or at the same time as, acid treatment. Alternately, high concentrations of acids that also
serve as solubilizing agents are employed (for example, 70-90% formic acid, acetic acid
[10%] with pyridine, or relatively high concentrations of HCl, 60 mM or more).
Because such conditions would disrupt a tag/affinity matrix association, direct cleavage
of an affinity tag from the target protein while a protein remains associated with an
affinity matrix is not attempted.

General conditions for cleavage at aspartate-proline sites are described in
Current Protocols in Molecular Biology (supp. 28; chapter 16.4) John Wiley & Sons
Inc. 1994, and in Landon, M. "Cleavage at Aspartyl-Prolyl Bonds" in Methods in
Enzymology (1977) 47: 145-149. These references suggest that significant variability
of cleavage conditions exists for different proteins and that cleavage might occur in some
instances without first denaturing or solubilizing the protein. However, in practice, the
latter circumstances are rare and proteins to be subjected to acid cleavage at Asp-Pro
dipeptides are usually solubilized to a state where there is no visible turbidity. Such
solubilized protein will normally not pellet when centrifuged at 100,000 x g for 1 hour.
It is now shown that mild-acid conditions may be used for cleavage of aspartate-proline
sites in Caulobacter S-layer fusion proteins without placing the protein in a solubilized
state as described above.

SUMMARY OF INVENTION 

This invention is based on the unexpected discovery that recombinant fusion
proteins produced by the Caulobacter S-layer protein secretion system can be cleaved
under mild-acid conditions and solubilization of the fusion protein is not required.
Cleavage may be accomplished while the fusion protein is in the form of an insoluble
aggregate typical of the Caulobacter S-layer protein. Cleavage occurs at
aspartate-proline dipeptides which may be in a heterologous protein portion of the
fusion protein or in a portion that is native to the Caulobacter S-layer portion. The
dipeptide may be placed at a desired location for cleavage by engineering DNA encoding
the fusion protein to express the dipeptide at the desired location. A preferable location
for cleavage may be at or near the junction between a heterologous (target) protein and
the Caulobacter S-layer portion comprising the Caulobacter secretion signal, such that a
cleavage product will be the target protein in its entirety and substantially free of
extraneous amino acids.

The current invention makes it possible to cleave a heterologous (target) protein
from the S-layer protein portion using only mild-acid conditions, even while the fusion
protein is in an aggregated form. These cleavage conditions do not result in significant
solubilization of the S-layer protein portion.

This invention provides a method of cleaving a fusion protein including a first
component which comprises all or part of a Caulobacter S-layer protein including a
Caulobacter C-terminal secretion signal, and a second component heterologous to
Caulobacter. The fusion protein contains at least one aspartate-proline dipeptide. The
method comprises combining the fusion protein with an acid solution of a strength
insufficient to solubilize the fusion protein for a time sufficient for cleavage of the
fusion protein at the aspartate-proline dipeptide. The acid solution may have a pH of
from about 1.5 (e.g. 1.5 ± 0.1) to about 2.5 (e.g. 2.5 ± 0.1), and preferably from about
1.65 (e.g. 1.65 ± 0.05) to about 2.35 (e.g. 2.35 ± 0.05). Preferred pH conditions may
be achieved using an acid equivalent in the range of about 5 to about 20 mM HCl.
The method is typically carried out at a temperature in the range of approximately room
temperature to about 50°C.

This invention also provides a method of preparing a DNA construct suitable for
expression of a fusion protein suitable for use in the method of this invention. The
method comprises joining an upstream DNA segment including DNA heterologous to
Caulobacter which includes a protein of interest to a downstream DNA segment
including DNA for a Caulobacter C-terminal secretion signal which does not encode an
aspartate-proline dipeptide. The upstream segment contains DNA encoding an
aspartate-proline dipeptide at or near the junction between said upstream and
downstream segments.

This invention also provides a method of preparing a fusion protein, comprising 
the steps of expressing a DNA construct as described above in Caulobacter and 
recovering said fusion protein once secreted by the Caulobacter. 

Once cleavage is accomplished according to this invention, the S-layer portion
comprising the Caulobacter secretion signal may remain as an insoluble aggregate. If
the target protein is soluble, the S-layer portion may be easily separated from the target
recombinant protein by simple centrifugation or filtration methods. Thus the system of
this invention facilitates separation as would a tag/affinity matrix system except that
here, the system is also the means for producing an insoluble matrix. In addition, the
insoluble matrix produced by this invention is resistant to the effects of the acid
treatment, allowing direct cleavage of the target recombinant protein. In this way, a
very inexpensive chemical cleavage method can be employed to economically retrieve
recombinant proteins from a bacterial fusion protein. In contrast to the cost of most
affinity matrices, there is little expense associated with the use of the S-layer secretion
signal as it is simply a part of the fermentation/secretion process.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION 



Production of Recombinant Fusion Proteins Using 
the Caulobacter S-layer Secretion System 

Proteins may be produced using the Caulobacter S-layer Type 1 secretion
pathway which requires only the C-terminal secretion signal of the Caulobacter. This
signal is the C-terminal portion of the S-layer protein, which typically comprises about
200 amino acids (see: Bingle et al. (1997¹) [supra]; and WO 97/34000). Additional
Caulobacter S-layer DNA upstream from the secretion signal may also be present and
may be desirable to encode portions of the S-layer protein which will contribute to
aggregate formation of the secreted protein. Such additional Caulobacter DNA may
constitute most or all of the remainder of the DNA encoding the S-layer protein.

Standard techniques (such as methods described in WO 97/34000) may be used
to identify the amount of the C-terminal portion of a particular Caulobacter S-layer
protein which functions as the secretion signal.

Creation of fusion proteins is commonly done by preparing DNA which codes 
for the target protein and fusing it in-frame with the C-terminal region of the S-layer 
gene. There are numerous possible methods, with the following being examples. 

1. Oligonucleotide Chemical Synthesis. This involves the design of
complementary single strands, complete with desirable restriction endonuclease cut sites
at the ends, and chemical synthesis of the strands followed by annealing and cloning
into a plasmid vector, juxtaposed to an appropriate portion of the C-terminal region of
the S-layer gene.

2. Production of the Target Gene DNA by Polymerase Chain Reaction (PCR)
Amplification of a Target Sequence. In this case, appropriate in-frame restriction
sites are incorporated into the short oligonucleotides used for amplification of a target
sequence, such that the final PCR product can be treated with the appropriate restriction
enzymes (to create the restriction site "sticky ends"), followed by cloning into a plasmid
vector, juxtaposed to an appropriate portion of the C-terminal region of the S-layer
gene.

3. Adapting Restriction Endonuclease Cleavage Sites that are Native to a
Target Protein Gene Sequence for Fusion to the DNA Coding for the C-terminal
S-layer Secretion Signal to Accomplish In-frame Expression of a Chimeric Protein.
This can be accomplished by direct ligation (although it is uncommon that an
appropriate match will occur), by the use of adapter sequences, or by methods involving
blunting of a restriction site and subsequent blunt-end ligation to change the expression
reading frame or join unlike restriction site sticky ends.
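Each of the three approaches above has to leave the target coding sequence in the same reading frame as the downstream S-layer secretion signal. The following is a minimal sketch of that check, not a procedure taken from this specification. The function names, the example sequences and the simple criteria used (length divisible by three, no in-frame stop codon) are illustrative assumptions only.

```python
# Hypothetical helper names; the criteria below are a simplified illustration.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons, DNA alphabet


def frame_padding_needed(target_cds: str) -> int:
    """Number of bases (0, 1 or 2) that must be added to the target segment
    so that the downstream segment stays in frame."""
    return (-len(target_cds)) % 3


def fusion_is_in_frame(target_cds: str, linker: str = "") -> bool:
    """Return True if target_cds + linker is a whole number of codons and
    contains no in-frame stop codon ahead of the downstream fusion partner."""
    seq = (target_cds + linker).upper()
    if len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return not any(codon in STOP_CODONS for codon in codons)


if __name__ == "__main__":
    # ATG GAC CCG: Met-Asp-Pro, three complete codons and no stop codon
    print(fusion_is_in_frame("ATGGACCCG"))
    # One base short of a full codon, so one base of padding is needed
    print(frame_padding_needed("ATGGACCC"))
```

A check of this kind only confirms the reading frame; whether a given fusion is actually expressed, secreted and aggregated still has to be determined experimentally.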

There will be numerous convenient sites for fusion with the C-terminal regions
of the S-layer that lead to the successful expression, secretion and aggregation of a
recombinant fusion protein. Some example positions are at or near the DNA sites
corresponding to amino acids 622, 690, 784, 892 and 907 of the C. crescentus S-layer
gene (see: Appendix 1 and WO 97/34000). Other sites of fusion with the S-layer gene
may also be employed. Most often a plasmid vector is designed such that the
C-terminal gene segment is resident on a plasmid with appropriate restriction sites placed
at the N-terminal junction of the S-layer fragment. Target recombinant protein gene
segments are then cloned into those restriction sites. It is typical to prepare initial
plasmid constructs that are replicated in E. coli. After a construct is produced, it is
typically transferred to a broad host range plasmid which can then be introduced into
the appropriate Caulobacter strain by electroporation. Suitable broad host range
plasmids can be constructed from (but are not limited to) the IncQ, IncW and IncP1
plasmid incompatibility groups.

The introduction of the aspartate-proline (Asp-Pro) dipeptide at the appropriate
site in the fusion protein can be done in several ways. Some examples are:

(a) incorporating a DNA sequence necessary to express the Asp-Pro
dipeptide into the oligonucleotides used to prepare the target sequence, either by
oligonucleotide synthesis or PCR methods;

(b) preparing a DNA segment with appropriate restriction sites at the termini
so that an Asp-Pro dipeptide can be introduced (most often at the junction between
S-layer and target gene) after a fusion recombinant S-layer gene has been made; and

(c) use of a native Asp-Pro dipeptide in either the target DNA or the S-layer
segment (for example, an Asp-Pro dipeptide is located at amino acids 692 and 693 of
the C. crescentus S-layer gene and is suitable for fusions made at the amino acid site).

The methods described above are not the only methods that may be used for
creating and expressing fusion recombinant S-layer proteins, nor is it necessary to have
the engineered genes resident on a plasmid. For example, the expressed gene may be
introduced into the chromosome (using well-known gene insertion or replacement
techniques) and still achieve secretion of the recombinant proteins (see WO 97/34000).
In some cases it may be desirable to produce recombinant fusion proteins as insertions
of heterologous DNA in the middle of the S-layer gene. In such a case, Asp-Pro
dipeptide sequences could be engineered at the N and C-termini of the target peptide.
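When planning where Asp-Pro dipeptides sit in a fusion protein, it can help to list every Asp-Pro site in the amino acid sequence and the fragments that complete cleavage would give. The sketch below is illustrative only: the function names and the short test sequence are hypothetical, and it assumes cleavage of the peptide bond between Asp and Pro, so that the upstream fragment ends in Asp and the downstream fragment begins with Pro.

```python
from typing import List


def asp_pro_sites(protein: str) -> List[int]:
    """Return the 0-based index of the Asp (D) residue of every
    Asp-Pro (DP) dipeptide in a one-letter amino acid sequence."""
    protein = protein.upper()
    return [i for i in range(len(protein) - 1) if protein[i:i + 2] == "DP"]


def acid_cleavage_fragments(protein: str) -> List[str]:
    """Return the fragments expected if every Asp-Pro bond were cleaved.
    Each cut falls between D and P."""
    protein = protein.upper()
    fragments = []
    start = 0
    for i in asp_pro_sites(protein):
        fragments.append(protein[start:i + 1])  # fragment ends with Asp
        start = i + 1                           # next fragment starts with Pro
    fragments.append(protein[start:])
    return fragments


if __name__ == "__main__":
    demo = "MKTDPAAGDPLL"  # made-up sequence with two Asp-Pro sites
    print(asp_pro_sites(demo))
    print(acid_cleavage_fragments(demo))
```

For an insertion in the middle of the S-layer gene, flanking the target peptide with an Asp-Pro site on each side would show up here as three fragments, with the target peptide as the middle one.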

All possible codon combinations for Asp-Pro will work but the CCA codon for
proline is not preferred due to the likelihood of a low amount of the corresponding
tRNA being present in Caulobacter. The following is an approximate usage table for
C. crescentus.
Caulobacter crescentus Codon Usage Table

[Amino Acid] [Triplet Code] [Frequency Per Thousand]

Phe  UUU   2.5    Ser  UCU   1.2    Tyr  UAU   6.6    Cys  UGU   0.6
Phe  UUC  27.0    Ser  UCC   8.5    Tyr  UAC   9.6    Cys  UGC   5.5
Leu  UUA   0.0    Ser  UCA   1.2    STOP UAA   0.8    STOP UGA   1.6
Leu  UUG   4.4    Ser  UCG  25.7    STOP UAG   0.6    Trp  UGG   7.2

Leu  CUU   4.4    Pro  CCU   2.5    His  CAU   3.2    Arg  CGU   7.6
Leu  CUC  15.7    Pro  CCC  15.5    His  CAC  12.2    Arg  CGC  44.7
Leu  CUA   1.1    Pro  CCA   0.9    Gln  CAA   3.7    Arg  CGA   3.0
Leu  CUG  72.3    Pro  CCG  27.1    Gln  CAG  30.2    Arg  CGG  12.1

Ile  AUU   2.4    Thr  ACU   1.2    Asn  AAU   4.1    Ser  AGU   0.8
Ile  AUC  49.0    Thr  ACC  37.3    Asn  AAC  23.8    Ser  AGC  14.9
Ile  AUA   0.3    Thr  ACA   0.8    Lys  AAA   2.7    Arg  AGA   0.4
Met  AUG  25.7    Thr  ACG  16.8    Lys  AAG  37.9    Arg  AGG   1.1

Val  GUU   5.4    Ala  GCU   9.5    Asp  GAU  11.1    Gly  GGU   9.5
Val  GUC  42.7    Ala  GCC  84.1    Asp  GAC  48.5    Gly  GGC  64.8
Val  GUA   1.0    Ala  GCA   2.2    Glu  GAA  20.5    Gly  GGA   2.3
Val  GUG  30.7    Ala  GCG  36.7    Glu  GAG  45.4    Gly  GGG   7.7
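The usage figures for the Asp and Pro codons can be used to compare the eight possible ways of encoding an Asp-Pro dipeptide. The sketch below ranks the codon pairs by the product of their per-thousand frequencies taken from the table above. The product score and the function name are assumptions made for illustration; the specification itself states only that the CCA codon for proline is not preferred.

```python
# Frequencies per thousand, copied from the C. crescentus usage table above.
ASP_CODONS = {"GAU": 11.1, "GAC": 48.5}
PRO_CODONS = {"CCU": 2.5, "CCC": 15.5, "CCA": 0.9, "CCG": 27.1}


def rank_asp_pro_codon_pairs():
    """Return (six-base RNA sequence, score) tuples for every Asp-Pro codon
    pair, sorted from highest to lowest score. The score is the product of
    the two codon frequencies, a simple heuristic chosen for this sketch."""
    pairs = [
        (asp + pro, asp_freq * pro_freq)
        for asp, asp_freq in ASP_CODONS.items()
        for pro, pro_freq in PRO_CODONS.items()
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for rna, score in rank_asp_pro_codon_pairs():
        dna = rna.replace("U", "T")  # DNA form for use in an oligonucleotide
        print(f"{rna}  ({dna})  score {score:.1f}")
```

Under this heuristic GAC CCG ranks first and the two pairs ending in CCA fall in the lower half of the list, which is consistent with the preference against CCA stated above.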
Large quantities (e.g. 12% of total cell protein/3% of input organic carbon) of a
wide range of proteins can be produced, with yields in the order of 250 mg/liter of
batch culture. Fusion proteins with 35 kDa of target peptide are secreted with little
difficulty, although proteins with multiple cysteines may be more difficult to express.
Post-expression glycosylation of proteins does not occur, an advantage for most peptide
expression applications.

Host Expression Strains

For secretion of recombinant fusion S-layer proteins, the Caulobacter strain will
preferably be one which has lost the ability to produce a native S-layer protein, while
retaining a fully functional S-layer protein secretion apparatus. Such strains may be
obtained by screening for mutants that have spontaneously become S-layer protein
negative; or, by directed genetic manipulation, such as (but not limited to) the insertion
of a drug resistance cassette in the middle of the S-layer gene or the substitution of a
version of the S-layer gene which has had a sizeable internal region deleted from the
gene (see: Bingle et al. 1997¹ [supra]; Bingle et al. 1997² "Cell Surface Display of a
Pseudomonas aeruginosa PAK Pilin Peptide with the Paracrystalline Layer of
Caulobacter crescentus" Molec. Microbiol. 26: 277-288; and Edwards and Smit (1991)
"A Transducing Bacteriophage for Caulobacter crescentus Uses the Paracrystalline Surface
Layer Protein as a Receptor" J. Bacteriol. 173: 5568-5572). In the case of a genetic
manipulation, a common method for producing such strains is to modify a copy of the
S-layer gene while on a plasmid and then to use well known gene replacement methods
to substitute the modified gene for the native gene in the Caulobacter chromosome (see:
Edwards and Smit (1991) [supra]).

If an entire S-layer gene is to be used for production of a recombinant protein
(via insertion of a target sequence), strains defective in the production of the
lipopolysaccharide (LPS) used for S-layer attachment to the bacterial surface can be
used. These can be prepared by forcing Caulobacter to grow without exogenous
calcium. Under these conditions mutants arise that are uniformly defective in
producing a proficient version of the S-layer LPS (see: Walker, S.G. et al. (1994)
"Characteristics of Mutants of Caulobacter crescentus Defective in Surface Attachment
of the Paracrystalline Layer" J. Bacteriol. 176: 6312-6323).

All Caulobacter S-layer producing strains are suitable for this technology. One
may isolate the S-layer gene from a particular strain (using homology between
Caulobacter S-layers to design probes to detect and clone the S-layer genes) and adapt
the C-terminal region for recombinant protein expression, in a manner similar to that
done for C. crescentus strains (see: MacRae and Smit (1991) [supra], and Walker, S.G.
et al. (1992) [supra]). Alternatively, one may construct recombinant fusion S-layer
genes using the C. crescentus S-layer gene and express the recombinant genes in
alternate Caulobacter hosts.

Freshwater Caulobacter producing S-layers may be readily detected by negative
stain transmission electron microscopy techniques. Caulobacter may be isolated using
the methods outlined by MacRae and Smit (1991) [supra], which take advantage of the
fact that Caulobacter can tolerate periods of starvation while other soil and water
bacteria may not and that they all produce a distinctive stalk structure, visible by light
microscopy (using either phase contrast or standard dye staining methods). Once
Caulobacter strains are isolated in a typical procedure, colonies may be suspended in
2% ammonium molybdate negative stain and applied to plastic-filmed, carbon-stabilized
300 or 400 mesh copper or nickel grids and examined in a transmission electron
microscope at 60 kilovolt accelerating voltage (see: Smit, J. (1986) "Protein Surface
Layers of Bacteria", in Outer Membranes as Model Systems (M. Inouye, ed.), J. Wiley
& Sons, at p. 343-376). S-layers are seen as two-dimensional geometric patterns most
readily on those cells in a colony that have lysed and released their internal contents.

Recombinant Protein Purification 

Secreted proteins are separated and shed into the culture media as a macroscopic
precipitate (the "aggregate" referred to herein). The shedding phenomenon is a
consequence of the absence of the N-terminal region of the S-layer protein in the
expressed recombinant protein, or the loss of the lipopolysaccharide species used for
S-layer attachment by the Caulobacter (see: Walker, S.G. et al. (1994) [supra]).
Typically, the aggregate forms as loose, gel-like lumps of pure protein that can readily
be retrieved and separated from the bacteria by simple filtration.

The aggregate may be readily separated from a soluble cleaved target protein by
any suitable technique such as filtration or centrifugation. If the target protein is
insoluble once cleaved, it may be convenient to then solubilize one or both of the
proteins (for example in 8 M urea or 6 M guanidine HCl) and separate by
chromatography. In this way, only 2 species of protein need to be separated.

Cleavage of Fusion Proteins 

General procedures for performing mild-acid cleavage are known in the
prior art as described above. In the method of this invention, conditions are adjusted to
avoid destruction of the target protein or solubilization of the aggregate containing the
S-layer secretion signal. Excess acid or too high a temperature may increase the
occurrence over time of random cleavages along the length of the fusion protein, which
is to be avoided since such random cleavages may lead to undersized fragmentation of
the fusion protein or solubilization of the aggregated S-layer portion.

Good yields of target protein with minimum random breaks in the fusion protein
may generally be achieved by using from 5-20 mM HCl (or its equivalent while
employing another acid). The respective pH of these conditions (unbuffered acid
solution) is from about 2.3 to about 1.7. Time and temperature are preferably adjusted
by routine monitoring to achieve the desired cleavage while minimizing random breaks.
For example, temperature may range from room temperature to about 50°C. Time of
treatment may range from about 12 to about 72 hours. Time or temperature outside of
these ranges is permissible depending upon the strength of the acid and the acceptable
yield. Generally, lower yields are obtained with less acid strength, less time or lower
temperatures.
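The pH figures quoted for 5-20 mM HCl can be checked with a short calculation. The sketch below assumes that HCl is fully dissociated in an unbuffered solution and that activity coefficients can be ignored, so that pH is simply the negative base-10 logarithm of the molar acid concentration. The function name is illustrative.

```python
import math


def ph_of_dilute_strong_acid(concentration_mM: float) -> float:
    """Approximate pH of an unbuffered monoprotic strong acid solution,
    assuming complete dissociation and ideal behaviour."""
    if concentration_mM <= 0:
        raise ValueError("concentration must be positive")
    molar = concentration_mM / 1000.0  # mM to mol/L
    return -math.log10(molar)


if __name__ == "__main__":
    for conc in (5, 20, 50):
        print(f"{conc} mM HCl: pH about {ph_of_dilute_strong_acid(conc):.2f}")
```

Under these assumptions 5 mM and 20 mM HCl give pH values of about 2.3 and 1.7 respectively, which agrees with the range stated above. A diprotic acid such as sulfuric acid would need a different treatment, and the aggregated protein itself may buffer the solution to some extent, so measured pH is the more reliable guide.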

In the following examples, efficiency of cleavage in the order of 40-80% is
achieved using conditions the same as or similar to the following alternatives:

- 5 mM HCl at 50°C for 48-72 hours

- 20 mM HCl at 30°C for 48-72 hours.

Conditions in excess of the aforementioned values may be employed in some
cases with the possibility of random breaks increasing, particularly with increased acid
strength or temperature. In the following examples, significant random cleavage
occurred with 50 mM HCl at 50°C after 48 hours.

Any acid may be employed in this invention which is normally used in solutions
to which proteins are exposed. Acids which have a deleterious effect on proteins under
dilute conditions should be avoided. For example, HCl or an equivalent amount of
H2SO4 may be used in this invention but oxidizing acids such as nitric acid may not be
suitable.

Example 1. Cleavage of artificial silk protein sequences
from a secretion signal containing a native aspartate-proline cleavage site.

An artificial protein sequence resembling spider silk was constructed by
synthesis of partially overlapping and complementing oligomers of DNA, which were
then completed to a full duplex DNA by Taq polymerase extension, to create a
sequence that coded for 97 amino acids. The resulting DNA sequence and
corresponding amino acid sequence are shown in Appendix 2.
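
The correspondence between a DNA sequence and its amino acid sequence can be illustrated with the standard genetic code. The following is a minimal sketch, assuming the standard codon table; the 60 nucleotides used are the first 20 codons listed in Appendix 2, and the function and variable names are illustrative only.

```python
# Standard genetic code, with codons enumerated in the base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding strand, read in frame from its first base."""
    dna = dna.upper().replace(" ", "")
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# First 20 codons of the artificial silk sequence of Appendix 2.
SILK_START = (
    "GAA TTC AGA TCT CAG GGC GCG GGG CAG GGT "
    "GGC TAT GGT GGG CTC GGC TCG CAA GGC GCT"
)

if __name__ == "__main__":
    print(translate(SILK_START))
```

The translation of these 20 codons is EFRSQGAGQGGYGGLGSQGA, which matches the first line of the amino acid sequence in Appendix 2.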

The DNA sequence shown in Appendix 2 was cloned into a gene segment carrier
sequence residing in a pUC8 plasmid cloning vector. The gene segment carrier had
BamHI restriction sites at each end and an internal BglII site. This combination of
restriction sites allowed the production of multimers of the above sequence, relying
on the fact that a BamHI sticky end will ligate to a BglII sticky end, with the loss
of both restriction sites. Thus one copy of the silk-like sequence within the gene
segment carrier can be put inside a second copy of the same to produce a dimer.
Using this principle, an 8X repeat was produced and fused to DNA encoding the
S-layer secretion signal corresponding to the C-terminal portion of the C. crescentus
S-layer protein from about amino acid 690 onwards (see: Appendix 1). This fusion
protein gene was introduced into strain CB2A on a broad host range plasmid vector.
The 8X multimer appeared to be unstable, resulting in recombination events that
reduced the 8X multimer to a 3X size. The 3-fold repeat of the above 97 amino acid
sequence, fused to the S-layer secretion signal, was secreted. Protein was collected
and subjected to treatment with 5 mM HCl for 2 days at 50° C. The result was the
liberation of about 80% of soluble silk-like polymer, which was readily separated by
filtration from the S-layer protein, which remained completely aggregated under
these conditions. Cleavage occurred at the native aspartate-proline dipeptide in the
Caulobacter S-layer signal region (see: Appendix 1, amino acids numbered 692-693).
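
The loss of both restriction sites on ligation of a BamHI end to a BglII end, relied upon above for building multimers, can be illustrated as follows. This is a sketch only, using the known recognition sequences G^GATCC (BamHI) and A^GATCT (BglII) and representing the top strand alone; the helper names are illustrative.

```python
# Recognition sequences (top strand, 5'->3'); both enzymes cut after the first
# base and leave the same 5'-GATC overhang, so their ends are compatible.
BAMHI = "GGATCC"
BGLII = "AGATCT"


def digest(site: str) -> tuple[str, str]:
    """Split a recognition site at the cut position (after the first base)."""
    return site[:1], site[1:]


def junction(upstream_site: str, downstream_site: str) -> str:
    """Top-strand sequence formed when the upstream half of one site is
    ligated to the downstream half of the other."""
    left, _ = digest(upstream_site)
    _, right = digest(downstream_site)
    return left + right


if __name__ == "__main__":
    for up, down in ((BAMHI, BGLII), (BGLII, BAMHI)):
        j = junction(up, down)
        print(j, "recut by BamHI:", j == BAMHI, "recut by BglII:", j == BGLII)
```

The hybrid junctions GGATCT and AGATCC are recognized by neither enzyme, so an insert placed in this way is not excised when the flanking BamHI sites are cut in the next round of multimerization.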

Example 2. Cleavage of the salmonid virus Infectious Pancreatic Necrosis
Virus (IPNV) surface glycoprotein candidate vaccine sequence from an
S-layer secretion signal containing a native aspartate-proline site.

The surface glycoprotein of the IPNV strain is a vaccine candidate. For this 
example and Example 4, the sequence of the first 257 amino acids of the mature protein 
and the corresponding DNA sequence as shown in Appendix 3 were used. 

DNA encoding a segment of the major surface glycoprotein gene of IPNV
specifying amino acids 145-257 of the protein was fused to DNA sequence specifying
two putative T-cell activating epitopes: MVF (SEQ ID No: 1; LSEIKGVTVHRLEGV,
derived from Measles Virus protein F) and P2 (SEQ ID No: 2; QYIKANSKFIGITEL,
derived from tetanus toxoid protein). The T-cell epitopes were positioned on the
C-terminal end of the IPNV sequence. This chimeric protein was in turn fused in frame
with the C. crescentus S-layer gene at about amino acid 690 position of the gene and
introduced into Caulobacter on a broad host range plasmid vector. The resulting
secreted protein was collected and treated with 5 mM HCl for 2 days at 50° C.
Cleavage occurred at the native aspartate-proline dipeptide described in Example 1.
The result was the liberation of about 75% of soluble vaccine candidate chimeric
protein from the S-layer secretion signal, which remained aggregated.

Example 3. Cleavage of segments of an E. coli type I pilus tip subunit from
an S-layer secretion signal containing a native aspartate-proline cleavage site.

The FimH gene product is the pilus tip subunit of the E. coli strains involved
with urinary tract infections. Two segments, T3 (specifying the first 145 amino acids
of the mature peptide) and T7 (specifying the entire 258 amino acids of the mature
peptide), were fused to the S-layer secretion signal at about amino acid 690 of the
S-layer sequence. The T3 and T7 sequences are shown in Appendix 4.

The fusion protein genes were introduced into strain CB2A on a broad host
range plasmid vector. In both cases the resulting secreted protein was collected and
treated with 5 mM HCl for 2 days at 50° C. In both cases, the result was the liberation
of about 50% of soluble vaccine candidate chimeric protein from the S-layer secretion
signal, which remained aggregated. Cleavage occurred at the native aspartate-proline
dipeptide described in Example 1.

Example 4. Cleavage of the salmonid virus IPNV surface glycoprotein
candidate vaccine sequence from an S-layer secretion signal containing
an introduced aspartate-proline cleavage site.

A segment of the major surface glycoprotein gene of IPNV specifying amino
acids 1-257 of the protein shown in Appendix 3 was fused to a DNA sequence
specifying a peptide containing an aspartate-proline dipeptide (SEQ ID No: 3;
SPLGPAGDPEAS) such that the aspartate-proline dipeptide was positioned very near
the C-terminus of the chimeric protein. This chimeric protein was in turn fused in
frame with the C. crescentus S-layer gene at about amino acid 784 position of the gene
and introduced into strain CB2A on a broad host range plasmid vector. The resulting
secreted protein was collected and treated with 5 mM HCl for 2 days at 50° C.
Cleavage occurred at the introduced aspartate-proline dipeptide. The result was the
liberation of about 40% of insoluble vaccine candidate chimeric protein from the
S-layer secretion signal, which remained aggregated.
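
The location of candidate cleavage sites in a one-letter amino acid sequence can be sketched as follows. This is an illustration only, assuming that mild acid treatment hydrolyzes the peptide bond between the aspartate (D) and the proline (P) of each Asp-Pro dipeptide and that no other bond is cut; the function name is hypothetical.

```python
from typing import List


def cleave_at_asp_pro(seq: str) -> List[str]:
    """Return the fragments expected if every Asp-Pro ("DP") bond is cut.

    Each cut is placed between D and P, so the upstream fragment ends in D
    and the downstream fragment begins with P.
    """
    fragments = []
    start = 0
    pos = seq.find("DP")
    while pos != -1:
        fragments.append(seq[start:pos + 1])  # include the D
        start = pos + 1                       # next fragment starts at the P
        pos = seq.find("DP", start)
    fragments.append(seq[start:])
    return fragments


if __name__ == "__main__":
    # SEQ ID No: 3 (linker peptide of Example 4) and the two epitopes of Example 2.
    for name, seq in (
        ("SEQ ID No: 3", "SPLGPAGDPEAS"),
        ("MVF", "LSEIKGVTVHRLEGV"),
        ("P2", "QYIKANSKFIGITEL"),
    ):
        print(name, cleave_at_asp_pro(seq))
```

Applied to the sequences recited above, the linker peptide of SEQ ID No: 3 contains a single Asp-Pro dipeptide, whereas the MVF and P2 epitope sequences contain none and are returned intact.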

Longer DNA and amino acid sequences referred to above are set out in the
following Appendices, which are part of this description. Appendix 1 sets out the
complete nucleotide sequence of the C. crescentus S-layer gene (SEQ ID No: 4) with
the upstream sequence including the -35 and -10 sites of the promoter region and the
Shine-Dalgarno sequence. The start codon is at nucleotide 101 and the coding sequence
runs to and includes nucleotide 3179. The amino acid sequence of the C. crescentus
S-layer protein (SEQ ID No: 5) included in Appendix 1 is predicted from the DNA
sequence. Appendix 2 sets out the artificial spider silk DNA sequence (SEQ ID No: 6)
used in Example 1 and the corresponding amino acid sequence (SEQ ID No: 7).
Appendix 3 sets out the DNA sequence (SEQ ID No: 8) and corresponding amino acid
sequence (SEQ ID No: 9) of the first 257 amino acids of IPNV as described in
Examples 2 and 4. Appendix 4 sets out the T3 protein sequence (SEQ ID No: 10) and
the T7 protein sequence (SEQ ID No: 11) as described in Example 3.

All publications, patents and patent applications referred to herein are hereby
incorporated by reference. While this invention has been described according to
particular embodiments and by reference to certain examples, it will be apparent to
those of skill in the art that variations and modifications of the invention as described
herein fall within the spirit and scope of the attached claims.

GCTATTGTCG ACGTATGACG TTTGCTCTAT AGCCATCGCT GCTCCCATGC GCGCCACTCG 60

GTCGCAGGGG GTGTGGGATT TTTTTTGGGA GACAATCCTC ATGGCCTATA CGACGGCCCA 120 

GTTGGTGACT GCGTACACCA ACGCCAACCT CGGCAAGGCG CCTGACGCCG CCACCACGCT 180 

GACGCTCGAC GCGTACGCGA CTCAAACCCA GACGGGCGGC CTCTCGGACG CCGCTGCGCT 240 

GACCAACACC CTGAAGCTGG TCAACAGCAC GACGGCTGTT GCCATCCAGA CCTACCAGTT 300 

CTTCACCGGC GTTGCCCCGT CGGCCGCTGG TCTGGACTTC CTGGTCGACT CGACCACCAA 360 

CACCAACGAC CTGAACGACG CGTACTACTC GAAGTTCGCT CAGGAAAACC GCTTCATCAA 420

CTTCTCGATC AACCTGGCCA CGGGCGCCGG CGCCGGCGCG ACGGCTTTCG CCGCCGCCTA 480 

CACGGGCGTT TCGTACGCCC AGACGGTCGC CACCGCCTAT GACAAGATCA TCGGCAACGC 540 

CGTCGCGACC GCCGCTGGCG TCGACGTCGC GGCCGCCGTG GCTTTCCTGA GCCGCCAGGC 600 

CAACATCGAC TACCTGACCG CCTTCGTGCG CGCCAACACG CCGTTCACGG CCGCTGCCGA 660 

CATCGATCTG GCCGTCAAGG CCGCCCTGAT CGGCACCATC CTGAACGCCG CCACGGTGTC 720 

GGGCATCGGT GGTTACGCGA CCGCCACGGC CGCGATGATC AACGACCTGT CGGACGGCGC 780 

CCTGTCGACC GACAACGCGG CTGGCGTGAA CCTGTTCACC GCCTATCCGT CGTCGGGCGT 840 

GTCGGGTTCG ACCCTCTCGC TGACCACCGG CACCGACACC CTGACGGGCA CCGCCAACAA 900 

CGACACGTTC GTTGCGGGTG AAGTCGCCGG CGCTGCGACC CTGACCGTTG GCGACACCCT 960 

GAGCGGCGGT GCTGGCACCG ACGTCCTGAA CTGGGTGCAA GCTGCTGCGG TTACGGCTCT 1020 

GCCGACCGGC GTGACGATCT CGGGCATCGA AACGATGAAC GTGACGTCGG GCGCTGCGAT 1080 

CACCCTGAAC ACGTCTTCGG GCGTGACGGG TCTGACCGCC CTGAACACCA ACACCAGCGG 1140 

CGCGGCTCAA ACCGTCACCG CCGGCGCTGG CCAGAACCTG ACCGCCACGA CCGCCGCTCA 1200

AGCCGCGAAC AACGTCGCCG TCGACGGGCG CGCCAACGTC ACCGTCGCCT CGACGGGCGT 1260 

GACCTCGGGC ACGACCACGG TCGGCGCCAA CTCGGCCGCT TCGGGCACCG TGTCGGTGAG 1320 

CGTCGCGAAC TCGAGCACGA CCACCACGGG CGCTATCGCC GTGACCGGTG GTACGGCCGT 1380 

GACCGTGGCT CAAACGGCCG GCAACGCCGT GAACACCACG TTGACGCAAG CCGACGTGAC 1440 

CGTGACCGGT AACTCCAGCA CCACGGCCGT GACGGTCACC CAAACCGCGG CCGCCACGGC 1500 

CGGCGCTACG GTCGCCGGTC GCGTCAACGG CGCTGTGACG ATCACCGACT CTGCCGCCGC 1560 

CTCGGCCACG ACCGCCGGCA AGATCGCCAC GGTCACCCTG GGCAGCTTCG GCGCCGCCAC 1620 

GATCGACTCG AGCGCTCTGA CGACCGTCAA CCTGTCGGGC ACGGGCACCT CGCTCGGCAT 1680

CGGCCGCGGC GCTCTGACCG CCACGCCGAC CGCCAACACC CTGACCCTGA ACGTCAATGG 1740

TCTGACGACG ACCGGCGCGA TCACGGACTC GGAAGCGGCT GCTGACGATG GTTTCACCAC 1800

CATCAACATC GCTGGTTCGA CCGCCTCTTC GACGATCGCC AGCCTGGTGG CCGCCGACGC 1860

GACGACCCTG AACATCTCGG GCGACGCTCG CGTCACGATC ACCTCGCACA CCGCTGCCGC 1920

CCTGACGGGC ATCACGGTGA CCAACAGCGT TGGTGCGACC CTCGGCGCCG AACTGGCGAC 1980

CGGTCTGGTC TTCACGGGCG GCGCTGGCCG TGACTCGATC CTGCTGGGCG CCACGACCAA 2040

GGCGATCGTC ATGGGCGCCG GCGACGACAC CGTCACCGTC AGCTCGGCGA CCCTGGGCGC 2100

TGGTGGTTCG GTCAACGGCG GCGACGGCAC CGACGTTCTG GTGGCCAACG TCAACGGTTC 2160

GTCGTTCAGC GCTGACCCGG CCTTCGGCGG CTTCGAAACC CTCCGCGTCG CTGGCGCGGC 2220

GGCTCAAGGC TCGCACAACG CCAACGGCTT CACGGCTCTG CAACTGGGCG CGACGGCGGG 2280

TGCGACGACC TTCACCAACG TTGCGGTGAA TGTCGGCCTG ACCGTTCTGG CGGCTCCGAC 2340

CGGTACGACG ACCGTGACCC TGGCCAACGC CACGGGCACC TCGGACGTGT TCAACCTGAC 2400

CCTGTCGTCC TCGGCCGCTC TGGCCGCTGG TACGGTTGCG CTGGCTGGCG TCGAGACGGT 2460

GAACATCGCC GCCACCGACA CCAACACGAC CGCTCACGTC GACACGCTGA CGCTGCAAGC 2520

CACCTCGGCC AAGTCGATCG TGGTGACGGG CAACGCCGGT CTGAACCTGA CCAACACCGG 2580

CAACACGGCT GTCACCAGCT TCGACGCCAG CGCCGTCACC GGCACGGCTC CGGCTGTGAC 2640

CTTCGTGTCG GCCAACACCA CGGTGGGTGA AGTCGTCACG ATCCGCGGCG GCGCTGGCGC 2700

CGACTCGCTG ACCGGTTCGG CCACCGCCAA TGACACCATC ATCGGTGGCG CTGGCGCTGA 2760

CACCCTGGTC TACACCGGCG GTACGGACAC CTTCACGGGT GGCACGGGCG CGGATATCTT 2820

CGATATCAAC GCTATCGGGA CCTCGACCGC TTTCGTGACG ATCACCGACG [illegible] 2880

CGACAAGCTC GACCTCGTCG GCATCTCGAC GAACGGCGCT ATCGCTGACG GCGCCTTCGG 2940

CGCTGCGGTC ACCCTGGGCG CTGCTGCGAC CCTGGCTCAG TACCTGGACG CTGCTGCTGC 3000

CGGCGACGGC AGCGGCACCT CGGTTGCCAA GTGGTTCCAG TTCGGCGGCG ACACCTATGT 3060

CGTCGTTGAC AGCTCGGCTG GCGCGACCTT CGTCAGCGGC GCTGACGCGG TGATCAAGCT 3120

GACCGGTCTG GTCACGCTGA CCACCTCGGC CTTCGCCACC GAAGTCCTGA CGCTCGCCTA 3180

AGCGAACGTC TGATCCTCGC CTAGGCGAGG ATCGCTAGAC TAAGAGACCC CGTCTTCCGA 3240

AAGGGAGGCG GGGTCTTTCT TATGGGCGCT ACGCGCTGGC CGGCCTTGCC TAGTTCCGGT 3300

Met Ala Tyr Thr Thr Ala Gln Leu Val Thr Ala Tyr Thr Asn Ala Asn
1 5 10 15

Leu Gly Lys Ala Pro Asp Ala Ala Thr Thr Leu Thr Leu Asp Ala Tyr
20 25 30

Ala Thr Gln Thr Gln Thr Gly Gly Leu Ser Asp Ala Ala Ala Leu Thr
35 40 45

Asn Thr Leu Lys Leu Val Asn Ser Thr Thr Ala Val Ala Ile Gln Thr
50 55 60

Tyr Gln Phe Phe Thr Gly Val Ala Pro Ser Ala Ala Gly Leu Asp Phe
65 70 75 80

Leu Val Asp Ser Thr Thr Asn Thr Asn Asp Leu Asn Asp Ala Tyr Tyr
85 90 95

Ser Lys Phe Ala Gln Glu Asn Arg Phe Ile Asn Phe Ser Ile Asn Leu
100 105 110

Ala Thr Gly Ala Gly Ala Gly Ala Thr Ala Phe Ala Ala Ala Tyr Thr
115 120 125

Gly Val Ser Tyr Ala Gln Thr Val Ala Thr Ala Tyr Asp Lys Ile Ile
130 135 140

Gly Asn Ala Val Ala Thr Ala Ala Gly Val Asp Val Ala Ala Ala Val
145 150 155 160

Ala Phe Leu Ser Arg Gln Ala Asn Ile Asp Tyr Leu Thr Ala Phe Val
165 170 175

Arg Ala Asn Thr Pro Phe Thr Ala Ala Ala Asp Ile Asp Leu Ala Val
180 185 190

Lys Ala Ala Leu Ile Gly Thr Ile Leu Asn Ala Ala Thr Val Ser Gly
195 200 205

Ile Gly Gly Tyr Ala Thr Ala Thr Ala Ala Met Ile Asn Asp Leu Ser
210 215 220

Asp Gly Ala Leu Ser Thr Asp Asn Ala Ala Gly Val Asn Leu Phe Thr
225 230 235 240

Ala Tyr Pro Ser Ser Gly Val Ser Gly Ser Thr Leu Ser Leu Thr Thr
245 250 255

Gly Thr Asp Thr Leu Thr Gly Thr Ala Asn Asn Asp Thr Phe Val Ala
260 265 270

Gly Glu Val Ala Gly Ala Ala Thr Leu Thr Val Gly Asp Thr Leu Ser
275 280 285

Gly Gly Ala Gly Thr Asp Val Leu Asn Trp Val Gln Ala Ala Ala Val
290 295 300

Thr Ala Leu Pro Thr Gly Val Thr Ile Ser Gly Ile Glu Thr Met Asn
305 310 315 320

Val Thr Ser Gly Ala Ala Ile Thr Leu Asn Thr Ser Ser Gly Val Thr
325 330 335

Gly Leu Thr Ala Leu Asn Thr Asn Thr Ser Gly Ala Ala Gln Thr Val

340 345 350

Thr Ala Gly Ala Gly Gln Asn Leu Thr Ala Thr Thr Ala Ala Gln Ala
355 360 365

Ala Asn Asn Val Ala Val Asp Gly Arg Ala Asn Val Thr Val Ala Ser
370 375 380

Thr Gly Val Thr Ser Gly Thr Thr Thr Val Gly Ala Asn Ser Ala Ala
385 390 395 400

Ser Gly Thr Val Ser Val Ser Val Ala Asn Ser Ser Thr Thr Thr Thr
405 410 415

Gly Ala Ile Ala Val Thr Gly Gly Thr Ala Val Thr Val Ala Gln Thr
420 425 430

Ala Gly Asn Ala Val Asn Thr Thr Leu Thr Gln Ala Asp Val Thr Val
435 440 445

Thr Gly Asn Ser Ser Thr Thr Ala Val Thr Val Thr Gln Thr Ala Ala
450 455 460

Ala Thr Ala Gly Ala Thr Val Ala Gly Arg Val Asn Gly Ala Val Thr
465 470 475 480

Ile Thr Asp Ser Ala Ala Ala Ser Ala Thr Thr Ala Gly Lys Ile Ala
485 490 495

Thr Val Thr Leu Gly Ser Phe Gly Ala Ala Thr Ile Asp Ser Ser Ala
500 505 510

Leu Thr Thr Val Asn Leu Ser Gly Thr Gly Thr Ser Leu Gly Ile Gly
515 520 525

Arg Gly Ala Leu Thr Ala Thr Pro Thr Ala Asn Thr Leu Thr Leu Asn
530 535 540

Val Asn Gly Leu Thr Thr Thr Gly Ala Ile Thr Asp Ser Glu Ala Ala
545 550 555 560

Ala Asp Asp Gly Phe Thr Thr Ile Asn Ile Ala Gly Ser Thr Ala Ser
565 570 575

Ser Thr Ile Ala Ser Leu Val Ala Ala Asp Ala Thr Thr Leu Asn Ile
580 585 590

Ser Gly Asp Ala Arg Val Thr Ile Thr Ser His Thr Ala Ala Ala Leu
595 600 605

Thr Gly Ile Thr Val Thr Asn Ser Val Gly Ala Thr Leu Gly Ala Glu
610 615 620

Leu Ala Thr Gly Leu Val Phe Thr Gly Gly Ala Gly Arg Asp Ser Ile
625 630 635 640

Leu Leu Gly Ala Thr Thr Lys Ala Ile Val Met Gly Ala Gly Asp Asp
645 650 655

Thr Val Thr Val Ser Ser Ala Thr Leu Gly Ala Gly Gly Ser Val Asn
660 665 670

Gly Gly Asp Gly Thr Asp Val Leu Val Ala Asn Val Asn Gly Ser Ser
675 680 685

Phe Ser Ala Asp Pro Ala Phe Gly Gly Phe Glu Thr Leu Arg Val Ala
690 695 700

Gly Ala Ala Ala Gln Gly Ser His Asn Ala Asn Gly Phe Thr Ala Leu
705 710 715 720

Gln Leu Gly Ala Thr Ala Gly Ala Thr Thr Phe Thr Asn Val Ala Val
725 730 735

Asn Val Gly Leu Thr Val Leu Ala Ala Pro Thr Gly Thr Thr Thr Val
740 745 750

Thr Leu Ala Asn Ala Thr Gly Thr Ser Asp Val Phe Asn Leu Thr Leu
755 760 765

Ser Ser Ser Ala Ala Leu Ala Ala Gly Thr Val Ala Leu Ala Gly Val
770 775 780

Glu Thr Val Asn Ile Ala Ala Thr Asp Thr Asn Thr Thr Ala His Val
785 790 795 800

Asp Thr Leu Thr Leu Gln Ala Thr Ser Ala Lys Ser Ile Val Val Thr
805 810 815

Gly Asn Ala Gly Leu Asn Leu Thr Asn Thr Gly Asn Thr Ala Val Thr
820 825 830

Ser Phe Asp Ala Ser Ala Val Thr Gly Thr Ala Pro Ala Val Thr Phe
835 840 845

Val Ser Ala Asn Thr Thr Val Gly Glu Val Val Thr Ile Arg Gly Gly
850 855 860

Ala Gly Ala Asp Ser Leu Thr Gly Ser Ala Thr Ala Asn Asp Thr Ile
865 870 875 880

Ile Gly Gly Ala Gly Ala Asp Thr Leu Val Tyr Thr Gly Gly Thr Asp
885 890 895

Thr Phe Thr Gly Gly Thr Gly Ala Asp Ile Phe Asp Ile Asn Ala Ile
900 905 910

Gly Thr Ser Thr Ala Phe Val Thr Ile Thr Asp Ala Ala Val Gly Asp
915 920 925

Lys Leu Asp Leu Val Gly Ile Ser Thr Asn Gly Ala Ile Ala Asp Gly
930 935 940

Ala Phe Gly Ala Ala Val Thr Leu Gly Ala Ala Ala Thr Leu Ala Gln
945 950 955 960

Tyr Leu Asp Ala Ala Ala Ala Gly Asp Gly Ser Gly Thr Ser Val Ala
965 970 975

Lys Trp Phe Gln Phe Gly Gly Asp Thr Tyr Val Val Val Asp Ser Ser
980 985 990

Ala Gly Ala Thr Phe Val Ser Gly Ala Asp Ala Val Ile Lys Leu Thr
995 1000 1005

Gly Leu Val Thr Leu Thr Thr Ser Ala Phe Ala Thr Glu Val Leu Thr
1010 1015 1020

Leu Ala
1025

GAA TTC AGA TCT CAG GGC GCG GGG CAG GGT GGC TAT GGT GGG CTC GGC TCG CAA GGC GCT
EFRSQGAGQGGYGGLGSQGA

GGC CTG GGT GGC CAG GGC GCT GGC GCG GCC GCG GCC GCT GCG GCC GGT GGC
GRGGQGAGAAAAAAAGG

GCT GGC CAG GGC GGG CTG GGC TCG CAG GGC GCC GGC CAA GGC GCT GGC GCC GCG GCC GCT
AGQGGLGSQGAGQGAGAAAA

GCG GCC GGT GGC GCC GGC CAG GGT GGC TAC GGC GGC CTG GGC AGC CAG GGC GCC GGT CGC
AAGGAGQGGYGGLGSQGAGR

GGC GGT CAG GGC GCC GGT GCC GCG GCC GCT GCG GCC GGT GGC GCT GGG CAA GGC GGC TAC
GGQGAGAAAAAAGGAGQGGY

GGC GGT CTG GGA TCC
G G L G S

1/1
atg aac aca aac aag gca acc gca act tac ttg aaa tcc att atg ctt cca gag [illegible] [illegible]
met asn thr asn lys ala thr ala thr tyr leu lys ser ile met leu pro glu thr gly

61/21
cca gca agc atc ccg gac gac ata acg gag aga cac atc tta aaa caa gag acc tcg tca
pro ala ser ile pro asp asp ile thr glu arg his ile leu lys gln glu thr ser ser

121/41
tac aac tta gag gtc tcc gaa tca gga agt ggc att ctt gtt tgt ttc cct ggg gca [illegible]
tyr asn leu glu val ser glu ser gly ser gly ile leu val cys phe pro gly ala pro

181/61
ggc tca cgg atc ggt gca cac tac aga tgg aat gcg aac cag acg ggg ctg gag ttc [illegible]
gly ser arg ile gly ala his tyr arg trp asn ala asn gln thr gly leu glu phe asp

241/81
cag tgg ctg gag acg tcg cag gac ctg aag aaa gcc ttc aac tac ggg agg ctg atc tca
gln trp leu glu thr ser gln asp leu lys lys ala phe asn tyr gly arg leu ile ser

301/101
agg aaa tac gac att caa agc tcc aca cta ccg gcc ggt ctc tat gct ctg aac ggg [illegible]
arg lys tyr asp ile gln ser ser thr leu pro ala gly leu tyr ala leu asn gly thr

361/121
ctc aac gct gcc acc ttc gaa ggc agt ctg tct gag gtg gag agc ctg acc tac aat agc
leu asn ala ala thr phe glu gly ser leu ser glu val glu ser leu thr tyr asn ser

421/141
ctg atg tcc cta act acg aac ccc cag gac aaa gcc aac aac cag ctg gtg acc aaa gga
leu met ser leu thr thr asn pro gln asp lys ala asn asn gln leu val thr lys gly

481/161
gtc acc gtc ctg aat cta cca aca ggg ttc gac aaa cca tac gtc cgc cta gag gac gag
val thr val leu asn leu pro thr gly phe asp lys pro tyr val arg leu glu asp glu

541/181
aca ccc cag ggt ctc cag tca atg aac ggg gcc agg atg agg tgc aca gct gca att [illegible]
thr pro gln gly leu gln ser met asn gly ala arg met arg cys thr ala ala ile [illegible]

601/201
cca cgg agg tac gag atc gac ctc cca tcc caa agc cta ccc ccc gtt cct gcg aca gga
pro arg arg tyr glu ile asp leu pro ser gln ser leu pro pro val pro ala thr gly

661/221
acc ctc acc act ctc tac gag gga aac gcc gac atc gtc agc tcc aca aca gtg acg [illegible]
thr leu thr thr leu tyr glu gly asn ala asp ile val ser ser thr thr val thr [illegible]

721/241
gac ata aac ttc agt ctg gca gaa cga ccc gca aac gag acc agg ttc gac ttc cag ctg
asp ile asn phe ser leu ala glu arg pro ala asn glu thr arg phe asp phe gln leu

CDVSA

CDVSA^D^^ 

GAVGTS AVS LG LTAN Y ARTGGQ VTAGN VQ S H GVTFVYQ

WHAT IS CLAIMED IS:

1. A method of cleaving a fusion protein including a first component which comprises all
or part of a Caulobacter S-layer protein including a Caulobacter C-terminal secretion
signal, and a second component heterologous to Caulobacter, the fusion protein
containing at least one aspartate-proline dipeptide, wherein the method comprises
combining the fusion protein with an acid solution of a strength insufficient to
solubilize the fusion protein for a time sufficient for cleavage of the fusion protein at
said aspartate-proline dipeptide.

2. The method of claim 1, wherein an aspartate-proline dipeptide is situated between the
first and second components or adjacent a junction between the first and second
components.

3. The method of claim 1 or 2, wherein the acid solution has a pH of from about 1.5 to
about 2.5.

4. The method of claim 1 or 2, wherein the acid solution has a pH of about 1.65 to about
2.35.

5. The method of any one of claims 1-4, wherein the method is carried out at a
temperature in the range of about 30° C. to about 50° C.

6. The method of any one of claims 1-5, wherein the method further comprises
separating products cleaved from the fusion protein.

7. A method of preparing a DNA construct for expression of a fusion protein suitable for
use in the method of claim 1, wherein the method comprises joining an upstream
DNA segment including DNA heterologous to Caulobacter which encodes a protein
of interest, to a downstream DNA segment including DNA for a Caulobacter
C-terminal secretion signal, wherein the downstream DNA segment does not encode an
aspartate-proline dipeptide, and wherein the upstream segment contains DNA
encoding an aspartate-proline dipeptide at or near an end of said upstream segment to
be joined to said downstream segment.

8. A method of preparing a fusion protein, comprising:

(1) expressing a DNA construct prepared as described in claim 7 in
Caulobacter; and

(2) recovering said fusion protein secreted by the Caulobacter.